model assessment
Probabilistic Models for Integration Error in the Assessment of Functional Cardiac Models
Oates, Chris, Niederer, Steven, Lee, Angela, Briol, François-Xavier, Girolami, Mark
This paper studies the numerical computation of integrals, representing estimates or predictions, over the output $f(x)$ of a computational model with respect to a distribution $p(\mathrm{d}x)$ over uncertain inputs $x$ to the model. For the functional cardiac models that motivate this work, neither $f$ nor $p$ possess a closed-form expression and evaluation of either requires $\approx$ 100 CPU hours, precluding standard numerical integration methods. Our proposal is to treat integration as an estimation problem, with a joint model for both the a priori unknown function $f$ and the a priori unknown distribution $p$. The result is a posterior distribution over the integral that explicitly accounts for dual sources of numerical approximation error due to a severely limited computational budget. This construction is applied to account, in a statistically principled manner, for the impact of numerical errors that (at present) are confounding factors in functional cardiac model assessment.
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)
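The posterior over the integral described above replaces the crude CLT-based error bar of classical Monte Carlo, which is the natural baseline when each model evaluation is expensive. A minimal pure-Python sketch of that baseline (all names hypothetical, with a cheap toy integrand standing in for the expensive cardiac model):

```python
import math
import random
import statistics

def mc_integral(f, sampler, n, seed=0):
    """Classical Monte Carlo estimate of E_p[f(x)] with a CLT error bar.

    When each evaluation of f costs ~100 CPU hours, n is tiny and this
    standard error is the only (crude) account of integration error --
    the baseline the paper's joint model improves upon.
    """
    rng = random.Random(seed)
    vals = [f(sampler(rng)) for _ in range(n)]
    mean = statistics.fmean(vals)
    se = statistics.stdev(vals) / math.sqrt(n)  # sigma_hat / sqrt(n)
    return mean, se

# Toy stand-in: f(x) = x^2 under a standard normal, so E[f] = 1.
estimate, stderr = mc_integral(lambda x: x * x,
                               lambda rng: rng.gauss(0.0, 1.0),
                               1000)
```

With the fixed seed the estimate lands near the true value 1 with a standard error of roughly 0.045.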
Model Assessment and Selection under Temporal Distribution Shift
Han, Elise, Huang, Chengpiao, Wang, Kaizheng
Statistical learning theory is traditionally founded on the assumption of a static data distribution, where statistical models are trained and deployed in the same environment. However, this assumption is often violated in practice, where the data distribution keeps changing over time. The temporal distribution shift can lead to a serious decline in model performance post-deployment, which underlines the critical need to monitor models and detect potential degradation. Moreover, one often needs to choose among multiple candidate models originating from different learning algorithms (e.g., linear regression, random forests, neural networks) and hyperparameters (e.g., penalty parameter, step size, time window for training). Temporal distribution shift poses a major challenge to model selection, as past performance may not reliably predict future outcomes. Learners usually have to work with limited data from the current time period and abundant historical data, whose distributions may vary significantly.
- Asia > Middle East > UAE > Dubai Emirate > Dubai (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (3 more...)
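The abstract's point can be made concrete: under temporal shift, assessing candidates on pooled history and on a recent window can select different models. A hedged pure-Python sketch (function names are illustrative, not from the paper):

```python
def recent_window_loss(y_true, y_pred, window):
    """Mean squared error restricted to the most recent `window` points."""
    recent_t = y_true[-window:]
    recent_p = y_pred[-window:]
    return sum((t - p) ** 2 for t, p in zip(recent_t, recent_p)) / len(recent_t)

def select_model(y_true, candidate_preds, window):
    """Pick the candidate with the lowest loss on the recent window.

    candidate_preds maps a model name to its predictions aligned with
    y_true. Under temporal shift, pooled historical loss can favor a
    stale model; a short recent window is a simple (but high-variance)
    alternative, illustrating the trade-off the paper studies.
    """
    return min(candidate_preds,
               key=lambda name: recent_window_loss(y_true,
                                                   candidate_preds[name],
                                                   window))

# A shifting series: the target jumps from 0 to 10 near the end.
y = [0.0] * 15 + [10.0] * 5
preds = {"stale": [0.0] * 20, "fresh": [10.0] * 20}
```

Here `select_model(y, preds, 20)` (pooled history) prefers `"stale"`, while `select_model(y, preds, 5)` (recent window) prefers `"fresh"`.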
On Leakage in Machine Learning Pipelines
Sasse, Leonard, Nicolaisen-Sobesky, Eliana, Dukart, Juergen, Eickhoff, Simon B., Götz, Michael, Hamdan, Sami, Komeyer, Vera, Kulkarni, Abhijit, Lahnakoski, Juha, Love, Bradley C., Raimondo, Federico, Patil, Kaustubh R.
Machine learning (ML) provides powerful tools for predictive modeling. ML's popularity stems from the promise of sample-level prediction with applications across a variety of fields from physics and marketing to healthcare. However, if not properly implemented and evaluated, ML pipelines may contain leakage, typically resulting in overoptimistic performance estimates and failure to generalize to new data. This can have severe negative financial and societal implications. Our aim is to expand understanding of the causes of leakage when designing, implementing, and evaluating ML pipelines. Illustrated by concrete examples, we provide a comprehensive overview and discussion of various types of leakage that may arise in ML pipelines.
- North America > United States > New York > New York County > New York City (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)
- (5 more...)
- Research Report (1.00)
- Overview (1.00)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
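One of the most common leakage patterns the abstract alludes to is preprocessing fitted on the full dataset before the train/test split. A minimal pure-Python illustration (hypothetical helper names; not an example from the paper):

```python
import statistics

def standardize(values, mean, stdev):
    return [(v - mean) / stdev for v in values]

def leaky_transform(train, test):
    """WRONG: scaling statistics are computed on train + test combined,
    so information about the test set leaks into the training features."""
    full = train + test
    m, s = statistics.fmean(full), statistics.stdev(full)
    return standardize(train, m, s), standardize(test, m, s)

def clean_transform(train, test):
    """RIGHT: statistics are fitted on the training split only, then
    applied unchanged to both splits."""
    m, s = statistics.fmean(train), statistics.stdev(train)
    return standardize(train, m, s), standardize(test, m, s)

train = [1.0, 2.0, 3.0, 4.0]
# Swap in two different test sets and watch the *training* features.
leaky_a, _ = leaky_transform(train, [5.0])
leaky_b, _ = leaky_transform(train, [50.0])
clean_a, _ = clean_transform(train, [5.0])
clean_b, _ = clean_transform(train, [50.0])
```

The leaky training features change when the test point changes, proving test data influenced them; the clean ones do not.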
Spatial machine-learning model diagnostics: a model-agnostic distance-based approach
While significant progress has been made towards explaining black-box machine-learning (ML) models, there is still a distinct lack of diagnostic tools that elucidate the spatial behaviour of ML models in terms of predictive skill and variable importance. This contribution proposes spatial prediction error profiles (SPEPs) and spatial variable importance profiles (SVIPs) as novel model-agnostic assessment and interpretation tools for spatial prediction models with a focus on prediction distance. Their suitability is demonstrated in two case studies: a regionalization task in an environmental-science context, and a land cover classification task based on remote-sensing data. In these case studies, the SPEPs and SVIPs of geostatistical methods, linear models, random forest, and hybrid algorithms show striking differences but also relevant similarities. Limitations of related cross-validation techniques are outlined, and the case is made that modelers should focus their model assessment and interpretation on the intended spatial prediction horizon. The range of autocorrelation, in contrast, is not a suitable criterion for defining spatial cross-validation test sets. The novel diagnostic tools enrich the toolkit of spatial data science, and may improve ML model interpretation, selection, and design.
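The idea of profiling error against prediction distance can be sketched crudely: bin test-set errors by distance to the nearest training location. This is an illustrative simplification of SPEPs under assumed 2-D coordinates, not the authors' implementation:

```python
import math

def nearest_training_distance(pt, train_pts):
    """Euclidean distance from a test point to the closest training point."""
    return min(math.dist(pt, q) for q in train_pts)

def spatial_error_profile(test_pts, errors, train_pts, bin_width):
    """Bin absolute prediction errors by distance to the nearest training
    location -- a rough, model-agnostic analogue of a spatial prediction
    error profile. Returns {bin_index: mean absolute error}."""
    bins = {}
    for pt, err in zip(test_pts, errors):
        b = int(nearest_training_distance(pt, train_pts) // bin_width)
        bins.setdefault(b, []).append(abs(err))
    return {b: sum(v) / len(v) for b, v in sorted(bins.items())}

# Errors tend to grow with distance from the training data.
profile = spatial_error_profile(test_pts=[(1.0, 0.0), (3.0, 0.0)],
                                errors=[0.5, 2.0],
                                train_pts=[(0.0, 0.0)],
                                bin_width=2.0)
```

A rising profile indicates the model extrapolates poorly beyond its training locations, which is exactly the behaviour the prediction-distance focus is meant to expose.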
mlr3spatiotempcv: Spatiotemporal resampling methods for machine learning in R
Schratz, Patrick, Becker, Marc, Lang, Michel, Brenning, Alexander
Spatial and spatiotemporal prediction tasks are common in applications ranging from environmental sciences to archaeology and epidemiology. While sophisticated mathematical frameworks have long been developed in spatial statistics to characterize predictive uncertainties under well-defined mathematical assumptions such as intrinsic stationarity (e.g., Cressie 1993), computational estimation procedures have only been proposed more recently to assess predictive performances of spatial and spatiotemporal prediction models (Brenning 2005, 2012; Pohjankukka, Pahikkala, Nevalainen, and Heikkonen 2017; Roberts, Bahn, Ciuti, Boyce, Elith, Guillera-Arroita, Hauenstein, Lahoz-Monfort, Schröder, Thuiller, Warton, Wintle, Hartig, and Dormann 2017). Although alternatives such as the bootstrap have existed for decades (Efron and Gong 1983; Hand 1997), cross-validation (CV) is a particularly well-established, easy-to-implement algorithm for model assessment of supervised machine-learning models (Efron and Gong 1983, and next section) and model selection (Arlot and Celisse 2010). In its basic form, CV resamples the data without paying attention to any possible dependence structure, which may arise from, e.g., grouped or structured data, or underlying environmental processes inducing some sort of spatial coherence at the landscape scale. In treating dependent observations as independent, or ignoring autocorrelation, CV test samples may in fact be heavily correlated with, or even pseudo-replicates of, the data used for training the model, which introduces a potentially severe bias in assessing the transferability of flexible machine-learning (ML) models.
- North America > United States > New York (0.04)
- Europe > Spain > Aragón (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- (4 more...)
- Energy (0.46)
- Food & Agriculture > Agriculture (0.46)
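The block-resampling idea behind spatial CV can be sketched by assigning observations to grid cells and holding out whole cells together, so that spatially correlated neighbours never straddle the train/test boundary. A toy analogue of spatial resampling, not the package's API:

```python
def grid_block_folds(points, cell_size):
    """Assign each (x, y) point to a square grid cell; each occupied cell
    becomes one CV fold, so nearby (spatially autocorrelated) points are
    held out together rather than leaking into the training set."""
    folds = {}
    for i, (x, y) in enumerate(points):
        cell = (int(x // cell_size), int(y // cell_size))
        folds.setdefault(cell, []).append(i)
    return list(folds.values())

# Two tightly clustered points end up in the same fold; the distant
# point forms its own fold.
folds = grid_block_folds([(0.1, 0.1), (0.2, 0.3), (5.0, 5.0)],
                         cell_size=1.0)
```

Random k-fold CV, by contrast, would happily place the two clustered points on opposite sides of the split, producing the optimistic bias described above.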
Approximate Cross-validation: Guarantees for Model Assessment and Selection
Wilson, Ashia, Kasy, Maximilian, Mackey, Lester
Cross-validation (CV) is a popular approach for assessing and selecting predictive models. However, when the number of folds is large, CV suffers from a need to repeatedly refit a learning procedure on a large number of training datasets. Recent work in empirical risk minimization (ERM) approximates the expensive refitting with a single Newton step warm-started from the full training set optimizer. While this can greatly reduce runtime, several open questions remain including whether these approximations lead to faithful model selection and whether they are suitable for non-smooth objectives. We address these questions with three main contributions: (i) we provide uniform non-asymptotic, deterministic model assessment guarantees for approximate CV; (ii) we show that (roughly) the same conditions also guarantee model selection performance comparable to CV; (iii) we provide a proximal Newton extension of the approximate CV framework for non-smooth prediction problems and develop improved assessment guarantees for problems such as l1-regularized ERM.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
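For a quadratic objective, the Newton-step approximation described above is exact, which makes a tiny 1-D ridge regression a useful sanity check. A sketch under that assumption (names are illustrative, not from the paper):

```python
def ridge_1d(xs, ys, lam):
    """Full-data minimizer of sum_j (y_j - w x_j)^2 + lam w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def exact_loo(xs, ys, lam, i):
    """Exact leave-one-out: drop point i and re-solve from scratch."""
    return ridge_1d(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)

def newton_loo(xs, ys, lam, i):
    """One Newton step on the leave-i-out objective, warm-started at the
    full-data optimizer. Because the objective is quadratic, the single
    step lands exactly on the leave-one-out solution -- the refit never
    has to be run."""
    w = ridge_1d(xs, ys, lam)
    sxx = sum(x * x for x in xs)
    # Leave-i-out gradient at w (the full-data gradient vanishes there).
    grad = 2 * xs[i] * (ys[i] - w * xs[i])
    # Curvature of the objective without point i.
    hess = 2 * (sxx - xs[i] ** 2 + lam)
    return w - grad / hess

xs, ys, lam = [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.9], 0.5
```

For non-quadratic (e.g., logistic) losses the step is only approximate, and for non-smooth penalties the paper's proximal Newton extension is needed.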
Structural modeling using overlapped group penalties for discovering predictive biomarkers for subgroup analysis
Ma, Chong, Deng, Wenxuan, Ma, Shuangge, Liu, Ray, Galinsky, Kevin
The identification of predictive biomarkers from a large number of covariates for subgroup analysis has attracted considerable attention in medical research. In this article, we propose a generalized penalized regression method with a novel penalty function that enforces the hierarchy between prognostic and predictive effects, such that a nonzero predictive effect forces its ancestor prognostic effects to be nonzero in the model. Our method selects useful predictive biomarkers by yielding a sparse, interpretable, and predictive model for subgroup analysis, and can handle different types of response variable, including continuous, categorical, and time-to-event data. We show that our method is asymptotically consistent under standard regularity conditions. To minimize the generalized penalized regression objective, we propose a novel optimization algorithm that integrates majorization-minimization with the alternating direction method of multipliers, which we name \texttt{smog}. Extensive simulation and real case studies demonstrate that our method is powerful for discovering the true predictive biomarkers and identifying subgroups of patients.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
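In ADMM-based algorithms of this kind, the sparsity penalty typically enters through a proximal (soft-thresholding) update. A generic sketch of that building block, not the paper's actual smog updates:

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t*|.|: argmin_w 0.5*(w - v)**2 + t*|w|.
    This shrink-toward-zero step is the standard l1 update inside ADMM
    iterations for sparse penalized regression."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def group_soft_threshold(vec, t):
    """Group analogue: shrink the whole coefficient group toward zero and
    kill it entirely when its norm falls below t -- the kind of all-or-
    nothing group update that hierarchy-enforcing penalties build on."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= t:
        return [0.0] * len(vec)
    scale = 1.0 - t / norm
    return [scale * v for v in vec]
```

Zeroing a group as a unit is what lets a structured penalty guarantee that a predictive effect cannot survive in the model without its ancestor prognostic effects.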